{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "635d8ebb",
   "metadata": {},
   "source": [
    "# Runnable Parallel\n",
    "\n",
    "- Author: [Jaemin Hong](https://github.com/geminii01)\n",
    "- Peer Review: [ranian963](https://github.com/ranian963), [Jinu Cho](https://github.com/jinucho)\n",
    "- Proofread : [Chaeyoon Kim](https://github.com/chaeyoonyunakim)\n",
    "- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/13-LangChain-Expression-Language/05-RunnableParallel.ipynb)[![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/13-LangChain-Expression-Language/05-RunnableParallel.ipynb)\n",
    "## Overview\n",
    "\n",
    "This tutorial covers ```RunnableParallel```, a core component of the LangChain Expression Language(LCEL).\n",
    "\n",
    "```RunnableParallel``` is designed to execute multiple Runnable objects in parallel and return a mapping of their outputs.\n",
    "\n",
    "This class delivers the same input to each Runnable, making it ideal for running independent tasks concurrently. Moreover, we can instantiate ```RunnableParallel``` directly or use a dictionary literal within a sequence.\n",
    "\n",
    "### Table of Contents\n",
    "\n",
    "- [Overview](#overview)\n",
    "- [Environment Setup](#environment-setup)\n",
    "- [Handling Input and Output](#handling-input-and-output)\n",
    "- [Using itemgetter as a Shortcut](#using-itemgetter-as-a-shortcut)\n",
    "- [Understanding Parallel Processing Step-by-Step](#understanding-parallel-processing-step-by-step)\n",
    "- [Parallel Processing](#parallel-processing)\n",
    "\n",
    "### References\n",
    "\n",
    "- [RunnableParallel](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableParallel.html)\n",
    "- [itemgetter](https://docs.python.org/3/library/operator.html#operator.itemgetter)\n",
    "- [FAISS](https://python.langchain.com/docs/integrations/vectorstores/faiss/#setup)\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c6c7aba4",
   "metadata": {},
   "source": [
    "## Environment Setup\n",
    "\n",
    "Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.\n",
    "\n",
    "**[Note]**\n",
    "- ```langchain-opentutorial``` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials. \n",
    "- You can check out the [```langchain-opentutorial```](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "21943adb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture --no-stderr\n",
    "%pip install langchain-opentutorial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f25ec196",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install required packages\n",
    "from langchain_opentutorial import package\n",
    "\n",
    "package.install(\n",
    "    [\n",
    "        \"langsmith\",\n",
    "        \"langchain_community\",\n",
    "        \"langchain_core\",\n",
    "        \"langchain_openai\",\n",
    "        \"faiss-cpu\",\n",
    "    ],\n",
    "    verbose=False,\n",
    "    upgrade=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "7f9065ea",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Environment variables have been set successfully.\n"
     ]
    }
   ],
   "source": [
    "# Set environment variables\n",
    "from langchain_opentutorial import set_env\n",
    "\n",
    "set_env(\n",
    "    {\n",
    "        \"OPENAI_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_API_KEY\": \"\",\n",
    "        \"LANGCHAIN_TRACING_V2\": \"true\",\n",
    "        \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n",
    "        \"LANGCHAIN_PROJECT\": \"05-RunnableParallel\",\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "690a9ae0",
   "metadata": {},
   "source": [
    "You can alternatively set API keys such as ```OPENAI_API_KEY``` in a ```.env``` file and load them.\n",
    "\n",
    "[Note] This is not necessary if you've already set the required API keys in previous steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4f99b5b6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load API keys from .env file\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "load_dotenv(override=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa00c3f4",
   "metadata": {},
   "source": [
    "## Handling Input and Output\n",
    "\n",
    "```RunnableParallel``` is useful for manipulating the output of one Runnable within a sequence to match the input format requirements of the next Runnable.\n",
    "\n",
    "Let's suppose a prompt expects input as a map with keys (```context``` , ```question```).\n",
    "\n",
    "The user input is simply the question, providing content. Therefore, you'll need to use a retriever to get the context and pass the user input under the ```question``` key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "69cb77da",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Teddy's occupation is an AI engineer.\""
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_community.vectorstores import FAISS\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.prompts import ChatPromptTemplate\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
    "\n",
    "# Create a FAISS vector store from text\n",
    "vectorstore = FAISS.from_texts(\n",
    "    [\"Teddy is an AI engineer who loves programming!\"], embedding=OpenAIEmbeddings()\n",
    ")\n",
    "\n",
    "# Use the vector store as a retriever\n",
    "retriever = vectorstore.as_retriever()\n",
    "\n",
    "# Define the template\n",
    "template = \"\"\"Answer the question based only on the following context:\n",
    "{context}\n",
    "\n",
    "Question: {question}\n",
    "\"\"\"\n",
    "\n",
    "# Create a chat prompt from the template\n",
    "prompt = ChatPromptTemplate.from_template(template)\n",
    "\n",
    "# Initialize the ChatOpenAI model\n",
    "model = ChatOpenAI(model=\"gpt-4o-mini\")\n",
    "\n",
    "# Construct the retrieval chain\n",
    "retrieval_chain = (\n",
    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
    "    | prompt\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Execute the retrieval chain to obtain an answer to the question\n",
    "retrieval_chain.invoke(\"What is Teddy's occupation?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a05c0e7",
   "metadata": {},
   "source": [
    "Note that type conversion is handled automatically when configuring ```RunnableParallel``` with other Runnables. We don't need to manually wrap the dictionary input provided to the ```RunnableParallel``` class.\n",
    "\n",
    "The following three methods present different initialization approaches that produce the same result:\n",
    "\n",
    "```python\n",
    "# Automatically wrapped into a RunnableParallel\n",
    "1. {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
    "\n",
    "2. RunnableParallel({\"context\": retriever, \"question\": RunnablePassthrough()})\n",
    "\n",
    "3. RunnableParallel(context=retriever, question=RunnablePassthrough())\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54d8197d",
   "metadata": {},
   "source": [
    "## Using itemgetter as a Shortcut\n",
    "\n",
    "Python’s ```itemgetter``` function offers a shortcut for extracting specific data from a map when it is combined with ```RunnableParallel``` .\n",
    "\n",
    "For example, ```itemgetter``` extracts specific keys from a map."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "6c8569e6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"Teddy's occupation is an AI engineer.\""
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from operator import itemgetter\n",
    "\n",
    "from langchain_community.vectorstores import FAISS\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.prompts import ChatPromptTemplate\n",
    "from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
    "\n",
    "# Create a FAISS vector store from text\n",
    "vectorstore = FAISS.from_texts(\n",
    "    [\"Teddy is an AI engineer who loves programming!\"], embedding=OpenAIEmbeddings()\n",
    ")\n",
    "# Use the vector store as a retriever\n",
    "retriever = vectorstore.as_retriever()\n",
    "\n",
    "# Define the template\n",
    "template = \"\"\"Answer the question based only on the following context:\n",
    "{context}\n",
    "\n",
    "Question: {question}\n",
    "\n",
    "Answer in the following language: {language}\n",
    "\"\"\"\n",
    "\n",
    "# Create a chat prompt from the template\n",
    "prompt = ChatPromptTemplate.from_template(template)\n",
    "\n",
    "# Construct the chain\n",
    "chain = (\n",
    "    {\n",
    "        \"context\": itemgetter(\"question\") | retriever,\n",
    "        \"question\": itemgetter(\"question\"),\n",
    "        \"language\": itemgetter(\"language\"),\n",
    "    }\n",
    "    | prompt\n",
    "    | ChatOpenAI(model=\"gpt-4o-mini\")\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Invoke the chain to answer the question\n",
    "chain.invoke({\"question\": \"What is Teddy's occupation?\", \"language\": \"English\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13ff52df",
   "metadata": {},
   "source": [
    "## Understanding Parallel Processing Step-by-Step\n",
    "\n",
    "Using ```RunnableParallel``` can easily run multiple Runnables in parallel and return a map of their outputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "57fd269f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'capital': 'The capital of the United States is Washington, D.C.',\n",
       " 'area': 'The total area of the United States is approximately 3.8 million square miles (about 9.8 million square kilometers). This includes all 50 states and the District of Columbia. If you need more specific details or comparisons, feel free to ask!'}"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_core.prompts import ChatPromptTemplate\n",
    "from langchain_core.runnables import RunnableParallel\n",
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "# Initialize the ChatOpenAI model\n",
    "model = ChatOpenAI(model=\"gpt-4o-mini\")\n",
    "\n",
    "# Define the chain for asking about capitals\n",
    "capital_chain = (\n",
    "    ChatPromptTemplate.from_template(\"Where is the capital of the {country}?\")\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Define the chain for asking about areas\n",
    "area_chain = (\n",
    "    ChatPromptTemplate.from_template(\"What is the area of the {country}?\")\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Create a RunnableParallel object to execute capital_chain and area_chain in parallel\n",
    "map_chain = RunnableParallel(capital=capital_chain, area=area_chain)\n",
    "\n",
    "# Invoke map_chain to ask about both the capital and area\n",
    "map_chain.invoke({\"country\": \"United States\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50044834",
   "metadata": {},
   "source": [
    "The following example explains how to execute chains that have different input template variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "4963d0a3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'capital': 'The capital of the Republic of Korea (South Korea) is Seoul.',\n",
       " 'area': 'The total area of the United States is approximately 3.8 million square miles (about 9.8 million square kilometers). This includes all 50 states and the District of Columbia.'}"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Define the chain for asking about capitals\n",
    "capital_chain2 = (\n",
    "    ChatPromptTemplate.from_template(\"Where is the capital of the {country1}?\")\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Define the chain for asking about areas\n",
    "area_chain2 = (\n",
    "    ChatPromptTemplate.from_template(\"What is the area of the {country2}?\")\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# Create a RunnableParallel object to execute capital_chain2 and area_chain2 in parallel\n",
    "map_chain2 = RunnableParallel(capital=capital_chain2, area=area_chain2)\n",
    "\n",
    "# Invoke map_chain with specific values for each key\n",
    "map_chain2.invoke({\"country1\": \"Republic of Korea\", \"country2\": \"United States\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ca493e0",
   "metadata": {},
   "source": [
    "## Parallel Processing\n",
    "\n",
    "```RunnableParallel``` is particularly useful for running independent processes in parallel because each Runnable in the map is executed concurrently.\n",
    "\n",
    "For example, you can see that ```area_chain```, ```capital_chain```, and ```map_chain``` take almost the same execution time, even though ```map_chain``` runs the other two chains in parallel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "85759f35",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.49 s ± 208 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# Invoke the chain for area and measure execution time\n",
    "area_chain.invoke({\"country\": \"United States\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "19e11763",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "860 ms ± 195 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# Invoke the chain for capital and measure execution time\n",
    "capital_chain.invoke({\"country\": \"United States\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "204373b0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.65 s ± 379 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "# Invoke the chain constructed in parallel and measure execution time\n",
    "map_chain.invoke({\"country\": \"United States\"})"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain-kr-lwwSZlnu-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
